Abstract

The notion of divergence information of an ensemble of probability distributions was
introduced by Jain, Radhakrishnan, and Sen [5, 7] in the context of the "substate theorem".
Since then, divergence has been recognized as a more natural measure of information in several
situations in quantum and classical communication.

We construct ensembles of probability distributions for which divergence information may 
be significantly smaller than the more standard Holevo information. As a result, we establish 
that lower bounds previously shown for Holevo information are weaker than similar ones shown 
for divergence information. 
1 Introduction



In this article, we study the relationship between two different measures of information contained
in an ensemble of probability distributions. The first measure, Holevo information, is a standard
notion from information theory, and is equivalent to the notion of mutual information between
two random variables. Consider jointly distributed random variables $XY$, with $X$ taking values
in a sample space $\mathcal{X}$. Consider the ensemble of distributions
$\mathcal{E} = \{(\lambda_i, Y_i) : i \in \mathcal{X}\}$, where $\lambda_i = \Pr(X = i)$ and
$Y_i = Y|(X = i)$, obtained by conditioning on values assumed by $X$. The Holevo
information of the ensemble is given by
$\chi(\mathcal{E}) = I(X : Y) = \mathbb{E}_{i \leftarrow X}\, S(Y_i \| Y)$, where $S(\cdot\|\cdot)$ measures
the relative entropy of a random variable (equivalently, distribution) with respect to another. This
notion may be extended to ensembles of quantum states (see, e.g., the text [11]), and the term
'Holevo information' is derived from the literature in quantum information theory.

The second measure, divergence information, was introduced by Jain, Radhakrishnan, and Sen
[5, 7]. It arises in the study of relative entropy, and its connection with a "substate property".
The observational divergence of two classical distributions $P, Q$ on the same finite sample space
is $\max_E P(E) \log_2 (P(E)/Q(E))$, where $E$ ranges over all events. We may view this as a (scaled)
measure of the factor by which $P$ may exceed $Q$ for an event of interest. The notion of divergence
information is derived from this as $D(\mathcal{E}) = \mathbb{E}_{i \leftarrow X}\, D(Y_i \| Y)$, in analogy
with Holevo information. A quantum generalisation of this measure may also be defined [7].

Relative entropy and Holevo (or mutual) information have been studied extensively in communication
theory and beyond (see, e.g., [2]) as they arise in a variety of applications. Since the discovery
of the substate theorem [5], divergence is being recognized as a more natural measure of information
in a growing number of applications [7, Section 1]. The applications include privacy trade-offs
in communication protocols for computing relations [6] and bit-string commitment [3], and the
communication complexity of remote state preparation [3]. In particular, divergence captures, up
to a constant factor, the substate property for probability distributions. It thus becomes relevant
in every application where the substate theorem is used.

We construct ensembles of probability distributions (equivalently, jointly distributed random
variables) for which the Holevo and divergence information are quantitatively different.

Theorem 1.1 For every positive integer $N$, and real number $k$ such that N > 2^^^^ , there is
an ensemble $\mathcal{E}$ of distributions over a sample space of size $N$ such that
$D(\mathcal{E}) = k$ and $\chi(\mathcal{E}) = \Theta(k \log\log N)$.

A more precise statement of this theorem (Theorem 3.1) and related results may be found in
Section 3.

The ensembles we construct satisfy the property that the ensemble average (i.e., the distribution of
the random variable $Y$ in the description above) is uniform. We show that the above separation is
essentially the best possible whenever the ensemble average is uniform (Theorem 3.5). The result
also applies to ensembles of quantum states, where the ensemble average is the completely mixed
state (Theorem 3.6). We leave open the possibility of larger separations for classical or quantum
ensembles with non-uniform averages.

The difference between the two measures demonstrated by Theorem 1.1 shows that in certain
applications, divergence is quantitatively a more relevant measure of information. In Section A we
describe two applications where functionally similar lower bounds have been established in terms
of both measures. This article shows that the lower bounds in terms of divergence information are,
in fact, stronger.

In prior work on the subject, Jain et al. [7, Appendix A] compare relative entropy and divergence
for classical as well as quantum states. For pairs of distributions $P, Q$ over a sample space of
size $N$, they show that $D(P\|Q) \le S(P\|Q) + 1$, and $S(P\|Q) \le D(P\|Q) \cdot (N - 1)$. This extends
to the corresponding measures of information in an ensemble:
$D(\mathcal{E}) \le \chi(\mathcal{E}) + 1$ and $\chi(\mathcal{E}) \le D(\mathcal{E}) \cdot (N - 1)$.
They show qualitatively similar relations for ensembles of quantum states. In
addition, they construct a pair of distributions $P, Q$ such that $S(P\|Q) = \Theta(D(P\|Q) \cdot N)$. However,
their construction does not appear to translate to a similar separation for ensembles of probability
distributions. Our work fills this gap for ensembles (of classical or quantum states) with a uniform
average.

2 Preliminaries 

Here, we summarise our notation and the information-theoretic concepts we encounter in this
work. We refer the reader to the text by Cover and Thomas [2] for a deeper treatment of (classical)
information theory. While the bulk of this article pertains to classical information theory, as
mentioned in Section 1, it is motivated by studies in (and has implications for) quantum information.
We refer the reader to the text [11] for an introduction to quantum information.

For a positive integer $N$, let $[N]$ represent the set $\{1, \ldots, N\}$. We view probability distributions
over $[N]$ as vectors in $\mathbb{R}^N$. The probability assigned by distribution $P$ to a sample point $i \in [N]$ is
denoted by $p_i$ (i.e., with the same letter in small case). We denote by $P^\downarrow$ the distribution obtained
from $P$ by composing it with a permutation $\pi$ on $[N]$ so that $p^\downarrow_i = p_{\pi(i)}$ and
$p^\downarrow_1 \ge p^\downarrow_2 \ge \cdots \ge p^\downarrow_N$.
For an event $E \subseteq [N]$, let $P(E) = \sum_{i \in E} p_i$ denote the probability of that event. We denote the
uniform distribution over $[N]$ by $U_N$.

We appeal to the majorisation relation for some of our arguments. The relation tells us which of
two given distributions is "more random".

Definition 2.1 (Majorisation) Let $P, Q$ be distributions over $[N]$. We say that $P$ majorises $Q$,
denoted as $P \succeq Q$, if

$$\sum_{j=1}^{i} p^\downarrow_j \;\ge\; \sum_{j=1}^{i} q^\downarrow_j$$

for all $i \in [N]$.

The following is straightforward. 

Fact 2.1 Any probability distribution $P$ on $[N]$ majorises $U_N$, the uniform distribution over $[N]$.

Throughout this article, we use 'log' to denote the logarithm with base 2, and 'In' to denote the 
logarithm with base e. 

Definition 2.2 (Entropy, relative entropy) Let $P, Q$ be probability distributions on $[N]$. The
entropy of $P$ is defined as $H(P) = -\sum_{i=1}^{N} p_i \log p_i$. The relative entropy between $P, Q$, denoted
$S(P\|Q)$, is defined as

$$S(P\|Q) = \sum_{i=1}^{N} p_i \log \frac{p_i}{q_i}.$$

Note that the relative entropy with respect to the uniform distribution is connected to entropy as
$S(P\|U_N) = \log N - H(P)$.

We can formalise the connection between majorisation and randomness through the following fact.

Fact 2.2 If $P, Q$ are distributions over $[N]$ such that $P$ majorises $Q$, i.e. $P \succeq Q$, then
$H(P) \le H(Q)$.
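As a small illustration of Definition 2.1 and Fact 2.2, the following Python sketch checks majorisation through sorted prefix sums and compares entropies. The helper names `majorises` and `entropy`, the tolerance, and the example distributions are our own choices for illustration and do not come from the text.

```python
import math

def majorises(p, q, tol=1e-12):
    """Return True if p majorises q: every prefix sum of the sorted p
    (non-increasing order) is at least the matching prefix sum of sorted q."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    cp = cq = 0.0
    for a, b in zip(ps, qs):
        cp += a
        cq += b
        if cp < cq - tol:  # tolerance absorbs floating-point rounding
            return False
    return True

def entropy(p):
    """Shannon entropy in bits (log base 2, as in the text)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Hypothetical example distributions on a 3-point sample space.
p = [0.6, 0.3, 0.1]
q = [0.4, 0.35, 0.25]
u = [1 / 3] * 3

p_maj_q = majorises(p, q)   # p is "less random" than q
q_maj_u = majorises(q, u)   # every distribution majorises the uniform one
u_maj_p = majorises(u, p)
```

In this example the more concentrated distribution `p` majorises `q`, and its entropy is the smaller of the two, in line with Fact 2.2.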

The notion of observational divergence was defined by Jain, Radhakrishnan, and Sen [5] in the
context of the "substate theorem".

Definition 2.3 (Observational divergence) Let $P, Q$ be probability distributions on $[N]$. Then
the observational divergence between them, denoted $D(P\|Q)$, is defined as

$$D(P\|Q) = \max_{E \subseteq [N]} P(E) \log \frac{P(E)}{Q(E)}.$$

Throughout the paper we refer to 'observational divergence' as simply 'divergence'.

Divergence is always non-negative, and the divergence of any distribution with respect to the 
uniform distribution is bounded. 

Lemma 2.3 For any probability distribution $P$ on $[N]$, we have $0 \le D(P\|U_N) \le \log N$.

Proof: Consider the event $E$ which achieves the divergence between $P$ and $U_N$. W.l.o.g., the
event $E$ is non-empty. Therefore $P(E) \ge U_N(E) \ge 1/N$, and
$0 \le D(P\|U_N) \le P(E) \log (P(E) N) \le \log N$. ■

We observe that we need only maximise over N events to calculate divergence with respect to the 
uniform distribution. 

Lemma 2.4 For any probability distribution $P$ on $[N]$ such that $P^\downarrow = P$, i.e.,
$p_1 \ge p_2 \ge \cdots \ge p_N$, we have

$$D(P\|U_N) = \max_{i \in [N]} P([i]) \log \frac{N \cdot P([i])}{i}.$$

Proof: By definition of observational divergence, the RHS above is bounded by $D(P\|U_N)$. For
the inequality in the other direction, we note that the probability $P(E)$ of any event $E$ with
size $n_E = |E|$ is bounded by $P([n_E])$, the probability of the first $n_E$ elements in $[N]$. We thus have

$$D(P\|U_N) = \max_{E \subseteq [N]} P(E) \log \frac{N \cdot P(E)}{n_E}
\le \max_{E \subseteq [N]} P(E) \log \frac{N \cdot P([n_E])}{n_E}
\le \max_{E \subseteq [N]} P([n_E]) \log \frac{N \cdot P([n_E])}{n_E},$$

since $P$ majorises $U_N$ (Fact 2.1) and $P([n_E]) \ge n_E / N$. This is equivalent to the RHS in the statement
of the lemma. ■
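The reduction in Lemma 2.4 can be checked numerically on small examples. The sketch below compares a brute-force maximisation over all events with the maximisation over the $N$ prefixes of the sorted distribution. The function names and the example distribution are hypothetical, and the brute-force routine assumes $Q$ has full support.

```python
import math
from itertools import combinations

def divergence_bruteforce(p, q):
    """Observational divergence: max over non-empty events E of
    P(E) * log2(P(E) / Q(E)). Exponential in len(p); small inputs only.
    Assumes q has full support, so Q(E) > 0 for every non-empty event."""
    n = len(p)
    best = 0.0  # the whole sample space contributes 1 * log2(1) = 0
    for r in range(1, n + 1):
        for event in combinations(range(n), r):
            pe = sum(p[i] for i in event)
            qe = sum(q[i] for i in event)
            if pe > 0 and qe > 0:
                best = max(best, pe * math.log2(pe / qe))
    return best

def divergence_uniform_prefix(p):
    """Divergence with respect to the uniform distribution, maximising only
    over the N prefixes of p sorted in non-increasing order (Lemma 2.4).
    Sorting is harmless because the uniform distribution is permutation
    invariant."""
    n = len(p)
    best, prefix = 0.0, 0.0
    for i, pi in enumerate(sorted(p, reverse=True), start=1):
        prefix += pi
        if prefix > 0:
            best = max(best, prefix * math.log2(n * prefix / i))
    return best

p = [0.5, 0.2, 0.15, 0.1, 0.05]        # hypothetical example
u = [1 / len(p)] * len(p)
d_brute = divergence_bruteforce(p, u)
d_prefix = divergence_uniform_prefix(p)
```

The two values agree up to floating-point error, while the prefix version needs only $N$ evaluations instead of $2^N - 1$.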

Definition 2.4 (Ensemble) An ensemble is a sequence of pairs $\{(\lambda_j, Q_j) : j \in [M]\}$, for some
integer $M$, where $\Lambda = (\lambda_j) \in \mathbb{R}^M$ is a probability distribution on $[M]$ and $Q_j$ are probability
distributions over the same sample space.

Definition 2.5 (Holevo information) The Holevo information of an ensemble
$\mathcal{E} = \{(\lambda_j, Q_j) : j \in [M]\}$, denoted as $\chi(\mathcal{E})$, is defined as

$$\chi(\mathcal{E}) = \sum_{j=1}^{M} \lambda_j S(Q_j \| Q),$$

where $Q = \sum_{j=1}^{M} \lambda_j Q_j$ is the ensemble average.

Definition 2.6 (Divergence information) The divergence information of an ensemble
$\mathcal{E} = \{(\lambda_j, Q_j) : j \in [M]\}$, denoted as $D(\mathcal{E})$, is defined as

$$D(\mathcal{E}) = \sum_{j=1}^{M} \lambda_j D(Q_j \| Q),$$

where $Q = \sum_{j=1}^{M} \lambda_j Q_j$ is the ensemble average.
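To make Definitions 2.5 and 2.6 concrete, the following sketch computes both quantities for a tiny classical ensemble. All function names and the two-element example ensemble are illustrative assumptions; the divergence is computed by brute force over events, which is only feasible for very small sample spaces.

```python
import math

def relative_entropy(p, q):
    """S(P||Q) in bits; assumes q_i > 0 wherever p_i > 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def observational_divergence(p, q):
    """max over non-empty events E of P(E) log2(P(E)/Q(E)), by brute force.
    Events are encoded as bitmasks over the sample space."""
    n = len(p)
    best = 0.0
    for mask in range(1, 1 << n):
        pe = sum(p[i] for i in range(n) if (mask >> i) & 1)
        qe = sum(q[i] for i in range(n) if (mask >> i) & 1)
        if pe > 0 and qe > 0:
            best = max(best, pe * math.log2(pe / qe))
    return best

def ensemble_average(weights, dists):
    n = len(dists[0])
    return [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(n)]

def holevo_information(weights, dists):
    q = ensemble_average(weights, dists)
    return sum(w * relative_entropy(d, q) for w, d in zip(weights, dists))

def divergence_information(weights, dists):
    q = ensemble_average(weights, dists)
    return sum(w * observational_divergence(d, q) for w, d in zip(weights, dists))

# Hypothetical ensemble: two distributions on a 3-point space, equal weights.
weights = [0.5, 0.5]
dists = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
chi = holevo_information(weights, dists)
div = divergence_information(weights, dists)
```

For this example both quantities are below one bit, and they satisfy the general relations $D(\mathcal{E}) \le \chi(\mathcal{E}) + 1$ and $\chi(\mathcal{E}) \le D(\mathcal{E}) \cdot (N - 1)$ quoted in Section 1.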

3 Divergence versus relative entropy 

In this section, we describe the construction of an ensemble for which there is a large separation 
between divergence and Holevo information. The ensemble has the property that the ensemble 
average is uniform. As a by-product of our construction, we also obtain a bound on the maximum 
possible separation for ensembles with a uniform average. 

We begin with the construction of the ensemble. Let
$f_L(k, N) = k(\ln\log(kN) - \ln(6k) + 1) - \log(1 + k\ln 2) - 1 - \frac{1}{\ln 2}$
be defined on points in the positive orthant of $\mathbb{R}^2$ with $Nk > 1$.

Theorem 3.1 For every integer $N > 1$, and every positive real number $\frac{16}{N} \le k \le \log N$, there
is an ensemble $\mathcal{E} = \{(\frac{1}{N}, Q_i) : i \in [N]\}$ with $\frac{1}{N} \sum_{i} Q_i = U_N$, the uniform distribution over $[N]$,
with $D(\mathcal{E}) \le k$, and

$$\chi(\mathcal{E}) \ge f_L(k, N).$$

To construct the ensemble described in the theorem above, we first construct a probability
distribution $P$ on $[N]$ with observational divergence $D(P\|U_N) \le k$ such that its relative entropy $S(P\|U_N)$
is large as compared with $k$. Let $f_U(k, N) = k(\ln\log(Nk) - \ln k + 1)$ be defined on points in the positive
orthant of $\mathbb{R}^2$ with $kN > 1$.

Theorem 3.2 For every integer $N > 1$, and every positive real number $\frac{16}{N} \le k \le \log N$, there is a
probability distribution $P$ with $D(P\|U_N) = k$, and

$$f_L(k, N) \le S(P\|U_N) \le f_U(k, N).$$

The construction of the ensemble is now immediate.

Proof of Theorem 3.1: Let $Q_j = P \circ \pi_j$, where $\pi_j$ is the cyclic permutation of $[N]$ by $j - 1$
places. We endow the set of the $N$ cyclic permutations $\{Q_j : j \in [N]\}$ of $P$ with the uniform
distribution. By construction, the ensemble average is $U_N$. Since both observational divergence
and relative entropy with respect to the uniform distribution are invariant under permutations of
the sample space, $D(\mathcal{E}) = D(P\|U_N) \le k$, and $\chi(\mathcal{E}) = S(P\|U_N) \ge f_L(k, N)$. ■
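The cyclic-shift construction in the proof above is easy to reproduce. The sketch below builds the $N$ shifts of a distribution, verifies that their uniform mixture is $U_N$, and confirms that the Holevo information equals $\log N - H(P)$. The function names and the example distribution are our own illustrative choices.

```python
import math

def cyclic_shift_ensemble(p):
    """Return the N cyclic shifts of p; member j is p shifted by j places."""
    n = len(p)
    return [[p[(i - j) % n] for i in range(n)] for j in range(n)]

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

p = [0.4, 0.3, 0.2, 0.1]                  # hypothetical example
n = len(p)
ensemble = cyclic_shift_ensemble(p)

# Ensemble average under the uniform distribution on the N shifts.
average = [sum(q[i] for q in ensemble) / n for i in range(n)]

# With a uniform average, S(Q_j || U_N) = log N - H(Q_j) for each member,
# and each Q_j has the same entropy as p.
chi = sum(math.log2(n) - entropy(q) for q in ensemble) / n
```

Every position of `average` equals $1/N$, because each entry of `p` appears exactly once in each coordinate across the shifts.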

We turn to the construction of the distribution $P$. Our construction is such that $P^\downarrow = P$, i.e.,
$p_1 \ge p_2 \ge \cdots \ge p_N$. Lemma 2.4 tells us that we need only ensure that

$$P([i]) \log \frac{N \cdot P([i])}{i} \le k, \quad \forall i \in [N], \qquad (1)$$

to ensure $D(P\|U_N) \le k$. Since $S(P\|U_N) = \log N - H(P)$, we wish to minimise the entropy of $P$
subject to the constraints in Eq. (1). This is equivalent to successively maximising $p_1, p_2, \ldots$, and
motivates the following definitions.

Define the function $g(y, x) = y \log(Ny/x) - k$ on the positive orthant of $\mathbb{R}^2$. Consider the function
$h : \mathbb{R}^+ \to \mathbb{R}^+$ implicitly defined by the equation $g(h(x), x) = 0$.

Lemma 3.3 The function $h : \mathbb{R}^+ \to \mathbb{R}^+$ is well-defined, strictly increasing, and concave.

Proof: Fix an $x \in \mathbb{R}^+$, and consider the function $g_x(y) = g(y, x)$. This function is continuous
on $\mathbb{R}^+$, tends to $-k < 0$ as $y \to 0^+$, and tends to $\infty$ as $y \to \infty$. By the Intermediate Value Theorem,
for some $y > 0$, we have $g_x(y) = 0$. Moreover, $g_x(y) \le -k$ for $0 < y \le x/N$, and $g_x$ is strictly
increasing for $y > x/(Ne)$ (its derivative is $g_x'(y) = \log \frac{Ney}{x}$). Therefore there is a unique $y$ such
that $g_x(y) = 0$ and $h(x)$ is well-defined.

The function $h$ satisfies the equation $h \log \frac{Nh}{x} = k$, and therefore the identity

$$x = N h \exp\left(-\frac{k \ln 2}{h}\right).$$

Differentiating with respect to $h$, we see that

$$\frac{dx}{dh} = N \left(1 + \frac{k \ln 2}{h}\right) \exp\left(-\frac{k \ln 2}{h}\right), \quad \text{and}$$

$$\frac{d^2 x}{dh^2} = \frac{N (k \ln 2)^2}{h^3} \exp\left(-\frac{k \ln 2}{h}\right).$$

So $\frac{dh}{dx} > 0$ for all $x > 0$, and $h$ is a strictly increasing function. Note also that $\frac{d^2 x}{dh^2} > 0$ for all $h > 0$,
so $x$ is a convex function of $h$. Since $h$ is an increasing function, convexity of $x(h)$ implies concavity
of $h(x)$. ■
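The function $h$ has no closed form, but the monotonicity argument in the proof suggests a simple way to evaluate it. The following sketch solves $g(y, x) = 0$ by bisection on the interval where $g_x$ is increasing, and then checks monotonicity and concavity on a few integer points. The function name `h`, the bracketing strategy, the iteration count, and the parameter values are assumptions made for illustration.

```python
import math

def h(x, n, k, iters=200):
    """Solve y * log2(n * y / x) = k for the unique root y > x / n.

    g(x / n) = -k < 0 and g is strictly increasing for y > x / (n * e),
    so bisection on [x / n, hi] with g(hi) >= 0 finds the root.
    """
    def g(y):
        return y * math.log2(n * y / x) - k

    lo = x / n
    hi = max(2 * lo, 1.0)
    while g(hi) < 0:          # grow the bracket until the sign changes
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

N, K = 1024, 2.0              # hypothetical parameters
values = [h(x, N, K) for x in range(1, 11)]
increments = [b - a for a, b in zip(values, values[1:])]
```

The values are strictly increasing and the increments shrink, which is the discrete counterpart of the concavity established in Lemma 3.3.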

Let $v_0 = 0$. For $i \in [N]$, let $v_i = h(i)$, i.e., $v_i \log \frac{N v_i}{i} = k$. Let $s_i = \min\{1, v_i\}$, for $i \in [N]$.
Let $p_1 = s_1$, and $p_i = s_i - s_{i-1}$ for all $2 \le i \le N$. Lemma 3.3 guarantees that these numbers are
well-defined. We claim that

Lemma 3.4 The vector $P = (p_i) \in \mathbb{R}^N$ defined above is a probability distribution, and $P^\downarrow = P$,
i.e., $p_1 \ge p_2 \ge \cdots \ge p_N$.

Proof: By definition, we have $v_i > 0$ for all $i \in [N]$. Therefore $s_1 = \min\{1, v_1\} > 0$. Since $h(x)$ is an
increasing function in $x$, the sequence $(v_i)$ is also increasing, so $(s_i)$ is non-decreasing. Therefore
$p_i = s_i - s_{i-1} \ge 0$ for $i > 1$.

Now $v_N \log v_N = k > 0$. Since $x \log x < 0$ for $x \in (0, 1)$, we have $v_N > 1$. So $s_N = \min\{1, v_N\} = 1$.
Therefore $\sum_{i=1}^{N} p_i = s_N = 1$. So $P$ is a probability distribution on $[N]$.

Note that $(v_2/2) \log(N v_2/2) = k/2 < k$, so $v_1 > v_2/2$. So $s_1 \ge s_2/2 \Leftrightarrow p_1 \ge p_2$. For $i \ge 2$, we
have $p_i - p_{i+1} = (s_i - s_{i-1}) - (s_{i+1} - s_i) = 2 s_i - s_{i-1} - s_{i+1}$. Since $h$ is concave, so is the
function $\min\{1, h(x)\}$. Therefore, $s_i \ge (s_{i-1} + s_{i+1})/2$, and the sequence $(p_i)$ is non-increasing. ■

The vector $S = (s_i) \in \mathbb{R}^N$ thus represents the (cumulative) distribution function corresponding
to $P$.
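The construction of $P$ can be carried out numerically. The sketch below computes $v_i = h(i)$ by bisection, truncates at 1 to obtain the cumulative values $s_i$, and takes differences to get $p_i$. It then checks the claims of Lemma 3.4 and that the divergence with respect to $U_N$ equals $k$. The helper names, the bisection routine, and the parameters $N = 64$, $k = 1.5$ are illustrative assumptions.

```python
import math

def solve_v(i, n, k, iters=200):
    """Root y > i/n of y * log2(n * y / i) = k, found by bisection."""
    def g(y):
        return y * math.log2(n * y / i) - k
    lo, hi = i / n, max(2 * i / n, 1.0)
    while g(hi) < 0:
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def build_distribution(n, k):
    """p_i = s_i - s_{i-1}, where s_i = min(1, v_i) and s_0 = 0."""
    probs, s_prev = [], 0.0
    for i in range(1, n + 1):
        s = min(1.0, solve_v(i, n, k))
        probs.append(s - s_prev)
        s_prev = s
    return probs

def divergence_uniform(p):
    """Prefix formula of Lemma 2.4; p is assumed sorted non-increasingly."""
    n, best, prefix = len(p), 0.0, 0.0
    for i, pi in enumerate(p, start=1):
        prefix += pi
        best = max(best, prefix * math.log2(n * prefix / i))
    return best

N, K = 64, 1.5                      # hypothetical, with 16/N <= K <= log2(N)
P = build_distribution(N, K)
D = divergence_uniform(P)
```

Every prefix constraint in Eq. (1) is tight until the cumulative mass reaches 1, which is why the computed divergence matches `K` up to the bisection accuracy.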

Proof of Theorem 3.2: We claim that the probability distribution $P$ constructed above satisfies
the properties stated in the theorem.

Since $P^\downarrow = P$, by Lemma 2.4 we need only verify that $s_i \log(N s_i / i) \le k$ for $i \in [N]$. If $s_i = v_i$,
then the condition is satisfied with equality. (Note that since $k \le \log N$, we have $s_1 = v_1 \le 1$.)
Else, $s_i = 1 < v_i$, so $s_i \log(N s_i / i) < v_i \log(N v_i / i) = k$.

We now bound the relative entropy $S(P\|U_N)$ from below. Let $n$ be the smallest positive integer
such that $v_{n-1} < 1$ and $v_n \ge 1$. Note that $n > 1$. We also have $n \le N$, since $v_N > 1$ (as
$v_N \log v_N = k > 0$). Therefore, we have $s_i = v_i$ (equivalently, $N s_i = i\, 2^{k/s_i}$) for $i \in [n - 1]$,
and $s_n = 1 \le v_n$. Thus, for $1 < i < n$,

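$$\begin{aligned}
N p_i &= i\, 2^{k/s_i} - (i - 1)\, 2^{k/s_{i-1}} \\
&= 2^{k/s_i} + (i - 1)\left(2^{k/s_i} - 2^{k/s_{i-1}}\right) \\
&= 2^{k/s_i} + (i - 1)\, 2^{k/s_{i-1}} \left(2^{\frac{k}{s_i} - \frac{k}{s_{i-1}}} - 1\right) \\
&= 2^{k/s_i} + N s_{i-1} \left(2^{\frac{k}{s_i} - \frac{k}{s_{i-1}}} - 1\right) \\
&\ge 2^{k/s_i} + N s_{i-1} \left(\frac{k}{s_i} - \frac{k}{s_{i-1}}\right) \ln 2 \\
&= 2^{k/s_i} - N p_i \frac{k}{s_i} \ln 2.
\end{aligned}$$

The penultimate line follows from the inequality $2^x \ge 1 + x \ln 2$ for all $x \in \mathbb{R}$. Thus we have

$$N p_i \ge \frac{2^{k/s_i}}{1 + \frac{k}{s_i} \ln 2}. \qquad (2)$$

Since $N p_1 = N s_1 = 2^{k/s_1}$, this also holds for $i = 1$.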
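We bound the relative entropy using Eq. (2).

$$\begin{aligned}
S(P\|U_N) &= \sum_{i=1}^{N} p_i \log N p_i = \sum_{i=1}^{n} p_i \log N p_i \\
&\ge \sum_{i=1}^{n-1} p_i \log \frac{2^{k/s_i}}{1 + \frac{k}{s_i} \ln 2} + p_n \log N p_n \\
&\ge \sum_{i=1}^{n-1} \frac{k p_i}{s_i} - \sum_{i=1}^{n-1} p_i \log\left(1 + \frac{k \ln 2}{s_i}\right) + p_n \log N p_n. \qquad (3)
\end{aligned}$$

We bound each of the three terms in the RHS of Eq. (3) separately.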
We start with $\sum_{i=1}^{n-1} \frac{k p_i}{s_i}$. Let $p = p_1$, and let



m 



For every $j \in [m]$, there is an $i \in [n]$, say $i = i_j$, such that $jp < s_{i_j} \le (j + 1)p$.
(Otherwise, for some $i > 1$, the probability $p_i = s_i - s_{i-1}$
is strictly larger than $p$, an impossibility.)



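[Figure: the curve $1/x$, with rectangles of height $1/s_i$ ending at the points $s_i$ (solid lines) and rectangles of uniform width $p$ ending at $2p, 3p, 4p, \ldots$ (dashed lines).]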



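We interpret the sum $\sum_{i=2}^{n-1} \frac{p_i}{s_i} = \sum_{i=2}^{n-1} \frac{s_i - s_{i-1}}{s_i}$ as a Riemann sum approximating the area under
the curve $1/x$ between $s_1$ and $s_{n-1}$ with the area under the solid lines in the accompanying figure. This area is
bounded from below by the area under the dashed lines, which corresponds to the area of rectangles
of uniform width $p$ and height $1/s_{i_{j+1}}$ for the $j$'th interval. Thus,

$$\begin{aligned}
\sum_{i=1}^{n-1} \frac{k p_i}{s_i} &\ge k + k \sum_{j=1}^{m} \frac{p}{s_{i_{j+1}}} \\
&\ge k + k \sum_{j=1}^{m} \frac{p}{(j + 2)p} \\
&\ge k + k \int_{3}^{m+3} \frac{1}{x}\, dx \\
&= k + k \ln \frac{m + 3}{3}. \qquad (4)
\end{aligned}$$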



We lower bound $m$ next. Recall that $g_1(y) = y \log(Ny)$ is an increasing function for $y \ge \frac{1}{Ne}$,
and $p = p_1 \ge 1/N$. Consider the value of $g_1(y)$ at the point $q = \frac{2k}{\log kN}$:

$$g_1(q) = \frac{2k}{\log kN} \log \frac{2Nk}{\log kN} \ge 2k \left(1 - \frac{\log\log kN}{\log kN}\right) \ge k,$$

since $kN \ge 16$. As $g_1(q) \ge g_1(p) > 0$, we have $q \ge p$. Therefore, $m \ge \frac{1}{p} - 1 \ge \frac{\log kN}{2k} - 1$. Together
with Eq. (4), we get

$$\sum_{i=1}^{n-1} \frac{k p_i}{s_i} \ge k(\ln\log kN - \ln 6k + 1). \qquad (5)$$



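Next, we derive a lower bound for the second term in Eq. (3).

$$\begin{aligned}
-\sum_{i=1}^{n-1} p_i \log\left(1 + \frac{k \ln 2}{s_i}\right)
&= -\sum_{i=1}^{n-1} p_i \log(s_i + k \ln 2) + \sum_{i=1}^{n-1} p_i \log s_i \\
&\ge -\log(1 + k \ln 2) + \sum_{i=1}^{n-1} p_i \log s_i. \qquad (6)
\end{aligned}$$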

Viewing the second term above as a Riemann sum, we get

$$\sum_{i=1}^{n-1} p_i \log s_i \ge \int_{0}^{s_{n-1}} \log x\, dx \ge \int_{0}^{1} \log x\, dx = -\frac{1}{\ln 2}. \qquad (7)$$



Combining Eq. (6) and (7), we get

$$-\sum_{i=1}^{n-1} p_i \log\left(1 + \frac{k \ln 2}{s_i}\right) \ge -\log(1 + k \ln 2) - \frac{1}{\ln 2}. \qquad (8)$$



We bound the third term in Eq. (3) crudely as $p_n \log N p_n \ge -1$. Along with the bounds for the
previous two terms, Eq. (5), (8), this shows that

$$S(P\|U_N) \ge f_L(k, N) \stackrel{\mathrm{def}}{=} k(\ln\log kN - \ln 6k + 1) - \log(1 + k \ln 2) - 1 - \frac{1}{\ln 2}. \qquad (9)$$

This proves the lower bound on the relative entropy. 
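Moving to an upper bound, we have for $i \ge 2$,

$$\begin{aligned}
N p_i &= i\, 2^{k/s_i} - (i - 1)\, 2^{k/s_{i-1}} \\
&= 2^{k/s_i} + (i - 1)\left(2^{k/s_i} - 2^{k/s_{i-1}}\right) \\
&\le 2^{k/s_i},
\end{aligned}$$

since the second term is negative. This also holds for $i = 1$, since $p_1 = s_1$ and $s_1 \log N s_1 = k$.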
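Therefore,

$$\begin{aligned}
S(P\|U_N) &= \sum_{i=1}^{N} p_i \log N p_i \\
&\le \sum_{i=1}^{N} \frac{k p_i}{s_i} \\
&\le k + k \int_{s_1}^{1} \frac{1}{s}\, ds \\
&= k - k \ln s_1 \\
&\le k + k \ln\left(\frac{\log Nk}{k}\right) \\
&= k(1 - \ln k + \ln(\log Nk)).
\end{aligned}$$

In the last inequality, we used the lower bound $s_1 \ge k / \log Nk$. ■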

The upper and lower bounds on the relative entropy of P with respect to the uniform distribution 
both behave as k log log Nk up to constant factors. 
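The two bounds of Theorem 3.2 can be compared against the actual relative entropy of the constructed distribution. The sketch below rebuilds $P$ for one parameter choice and evaluates $f_L$, $S(P\|U_N)$, and $f_U$. The parameters $N = 4096$, $k = 2$ and all helper names are assumptions chosen for illustration. For moderate $N$ the additive constants in $f_L$ dominate, so the lower bound only becomes informative when $N$ is very large compared with $k$.

```python
import math

def solve_v(i, n, k, iters=100):
    """Root y > i/n of y * log2(n * y / i) = k, by bisection."""
    def g(y):
        return y * math.log2(n * y / i) - k
    lo, hi = i / n, max(2 * i / n, 1.0)
    while g(hi) < 0:
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def build_distribution(n, k):
    """Differences of the truncated cumulative values s_i = min(1, v_i)."""
    probs, s_prev = [], 0.0
    for i in range(1, n + 1):
        s = min(1.0, solve_v(i, n, k))
        probs.append(s - s_prev)
        s_prev = s
    return probs

def f_lower(k, n):
    return (k * (math.log(math.log2(k * n)) - math.log(6 * k) + 1)
            - math.log2(1 + k * math.log(2)) - 1 - 1 / math.log(2))

def f_upper(k, n):
    return k * (math.log(math.log2(n * k)) - math.log(k) + 1)

N, K = 4096, 2.0                         # hypothetical parameters
P = build_distribution(N, K)
# Relative entropy with respect to the uniform distribution, in bits.
S = sum(x * math.log2(N * x) for x in P if x > 0)
lower, upper = f_lower(K, N), f_upper(K, N)
```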

Proof of Theorem 1.1: The dominating term in both the lower bound and the upper bound on
the relative entropy $S(P\|U_N)$ is $k \ln\log Nk$ when $N$ is large as compared with $k$. Specifically,
when > 2^^^ , we have 

^kloglog Nk < S{P\\Vn) < 2kloglog Nk. 

Since $k \le \log N$ (Lemma 2.3), $S(P\|U_N) = \Theta(D(P\|U_N) \log\log N)$. The same holds for the ensembles
constructed in Theorem 3.1. ■

The separation we demonstrated above is the best possible for ensembles of distributions that have 
a uniform average distribution. 

Theorem 3.5 For any positive integer $N$, and any ensemble $\mathcal{E} = \{(\lambda_j, Q_j) : j \in [M]\}$ of
distributions over $[N]$ such that $\sum_{j=1}^{M} \lambda_j Q_j = U_N$, we have

$$\chi(\mathcal{E}) \le K(2 \ln\log N - \ln K + 1) + 16,$$

where $K = D(\mathcal{E})$.

Proof: Let $D(Q_j\|U_N) = k_j$. We show that $S(Q_j\|U_N) \le k_j(2 \ln\log N - \ln k_j + 1)$ when $k_j \ge \frac{16}{N}$.
When $k_j < \frac{16}{N}$, we have $S(Q_j\|U_N) < 16$. Since $k(2 \ln\log N - \ln k + 1)$ is a concave function in $k$,
averaging over $j$ with respect to the distribution $\Lambda = (\lambda_j)$ gives the claimed bound.

Fix a $j$ such that $k_j \ge \frac{16}{N}$. Let $R = Q_j^\downarrow$. Note that $D(R\|U_N) = k_j$ and $S(R\|U_N) = S(Q_j\|U_N)$.
Consider the distribution $P$ constructed as in Section 3 with $k = k_j$. Using the notation of that
section, we have $s_i \log(N s_i / i) = k_j$ for all $i < n$, and $s_n = 1$. Let $t_i = \sum_{l=1}^{i} r_l$. By definition,
we have $t_i \log(N t_i / i) \le k_j = s_i \log(N s_i / i)$. Since the function $g_i(y) = y \log(Ny/i)$ is strictly
increasing for $y \ge i/(Ne)$, and $t_i \ge i/N$ (Fact 2.1), we have $t_i \le s_i$ for $i < n$. Since $s_i = 1$ for $i \ge n$,
we have $t_i \le s_i$ for these $i$ as well. In other words, $P \succeq R$. By Fact 2.2, $H(P) \le H(R) \Leftrightarrow
S(R\|U_N) \le S(P\|U_N)$. By Theorem 3.2, $S(P\|U_N) \le k_j(\ln\log(N k_j) - \ln k_j + 1)$. Since $k_j \le \log N$,
this is at most $k_j(2 \ln\log N - \ln k_j + 1)$. ■
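The per-distribution inequality at the heart of this proof can be sanity-checked on examples. For the ensemble of cyclic shifts of a distribution $R$ (as in the proof of Theorem 3.1), $\chi(\mathcal{E}) = S(R\|U_N)$ and $D(\mathcal{E}) = D(R\|U_N)$, so it suffices to examine a single distribution. The sketch below does this for two hand-picked distributions on $N = 64$ points. The function names and the examples are hypothetical, and the check illustrates the bound rather than proving it.

```python
import math

def divergence_uniform(p):
    """D(P||U_N) via the prefix formula of Lemma 2.4 (p is sorted first)."""
    n, best, prefix = len(p), 0.0, 0.0
    for i, pi in enumerate(sorted(p, reverse=True), start=1):
        prefix += pi
        best = max(best, prefix * math.log2(n * prefix / i))
    return best

def relative_entropy_uniform(p):
    """S(P||U_N) = sum_i p_i log2(N p_i)."""
    n = len(p)
    return sum(x * math.log2(n * x) for x in p if x > 0)

def theorem_bound(k, n):
    """k (2 ln log N - ln k + 1), the bound without the additive 16."""
    return k * (2 * math.log(math.log2(n)) - math.log(k) + 1)

N = 64
# Example 1: half the mass on one point, the rest spread evenly.
spiked = [0.5] + [0.5 / (N - 1)] * (N - 1)
# Example 2: linearly decreasing probabilities N, N-1, ..., 1 (normalised).
total = N * (N + 1) / 2
linear = [(N - i) / total for i in range(N)]

results = []
for dist in (spiked, linear):
    k = divergence_uniform(dist)
    s = relative_entropy_uniform(dist)
    results.append((k, s, theorem_bound(k, N)))
```

Both examples have divergence at least $16/N$, so the case of the proof without the additive constant applies, and in each case the relative entropy stays below the bound.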

Finally, we observe that this is also the best separation possible for an ensemble of quantum states 
with a completely mixed ensemble average. 
Theorem 3.6 For any positive integer $N$, and any ensemble $\mathcal{E} = \{(\lambda_j, \rho_j) : j \in [M]\}$ of quantum
states $\rho_j$ over a Hilbert space of dimension $N$ such that $\sum_{j=1}^{M} \lambda_j \rho_j = \frac{I}{N}$, the completely mixed state
of dimension $N$, we have

$$\chi(\mathcal{E}) \le K(2 \ln\log N - \ln K + 1) + 16,$$

where $K = D(\mathcal{E})$.

Proof: Let $Q_j$ be the probability distribution on $[N]$ corresponding to the eigenvalues of $\rho_j$. By
definition of observational divergence for quantum states, $D(Q_j\|U_N) \le D(\rho_j\|\frac{I}{N})$. Further, we
have $S(\rho_j\|\frac{I}{N}) = S(Q_j\|U_N)$. We now apply the same reasoning as in the proof of Theorem 3.5,
note that the divergence of the ensemble $\{(\lambda_j, Q_j) : j \in [M]\}$ is bounded by $D(\mathcal{E})$, and that the
RHS in the statement is a non-decreasing function of $K$. This gives us the stated bound. (Note
that we do not need $\sum_{j=1}^{M} \lambda_j Q_j = U_N$ to use the reasoning in Theorem 3.5.) ■
A Implications for quantum protocols

A.1 Quantum string commitment

A string commitment scheme is an extension of the well-studied and powerful cryptographic
primitive of bit commitment. In such schemes, one party, Alice, wishes to commit an entire
string $x \in \{0, 1\}^n$ to another party, Bob. The protocol is required to be such that Bob should
not be able to identify the string until it is revealed by Alice. In turn, Alice should not be able
to renege on her commitment at the time of revelation. Formally, quantum string commitment
protocols are defined as follows [1, 3].

Definition A.1 (Quantum string commitment (QSC)) Let $P = \{p_x : x \in \{0, 1\}^n\}$ be a
probability distribution and let $B$ be a measure of information contained in an ensemble of quantum
states. An $(n, a, b)$-$B$-QSC protocol for $P$ is a quantum communication protocol between two parties,
Alice and Bob. Alice gets an input $x \in \{0, 1\}^n$ chosen according to the distribution $P$. The starting
joint state of the qubits of Alice and Bob is some pure state independent of $x$. The protocol runs in
two phases: the commit phase, followed by the reveal phase. There are no intermediate
measurements during the protocol. At the end of the reveal phase, Bob measures his qubits according to a
POVM $\{M_y : y \in \{0, 1\}^n\} \cup \{I - \sum_y M_y\}$ to determine the value of the committed string by Alice
or to detect cheating. The protocol satisfies the following properties.

1. (Correctness) Suppose Alice and Bob act honestly. Let $\rho_x$ be the state of Bob's qubits at the
end of the reveal phase of the protocol, when Alice gets input $x$. Then $(\forall x, y)$ $\mathrm{Tr}\, M_y \rho_x = 1$ iff
$x = y$, and $0$ otherwise.

2. (Concealing property) Suppose Alice acts honestly, and Bob possibly cheats, i.e., deviates
from the protocol in his local operations. Let $\sigma_x$ be the state of Bob's qubits after the commit
phase when Alice gets input $x$. Then the $B$ information $B(\mathcal{E})$ of the ensemble $\mathcal{E} = \{p_x, \sigma_x\}$ is
at most $b$. In particular, this also holds when both Alice and Bob follow the protocol honestly.

3. (Binding property) Suppose Bob acts honestly, and Alice possibly cheats. Let $c \in \{0, 1\}^n$
be a string in a special cheating register $C$ with Alice that she keeps independent of the rest of
the registers till the end of the commit phase. Let $\tau_c$ be the state of Bob's qubits at the end of
the reveal phase when Alice has $c$ in the cheating register. Let $q_c \stackrel{\mathrm{def}}{=} \mathrm{Tr}\, M_c \tau_c$. Then

cg{0,l}" 

The idea behind the above definition is as follows. At the end of the reveal phase of an honest
run of the protocol Bob identifies $x$ from $\rho_x$ by performing the POVM measurement
$\{M_y\} \cup \{I - \sum_y M_y\}$. He accepts the committed string to be $x$ iff the observed outcome $y = x$; this happens
with probability $\mathrm{Tr}\, M_x \rho_x$. He declares that Alice is cheating if outcome $I - \sum_x M_x$ is observed.
Thus, at the end of an honest run of the protocol, with probability 1, Bob accepts the committed
string as being exactly Alice's input string. The concealing property ensures that the amount
of $B$ information about $x$ that a possibly cheating Bob gets is bounded by $b$. In bit-commitment
protocols, the concealing property is quantified in terms of the probability with which Bob can guess
Alice's bit. Here we instead use different notions of information contained in the corresponding
ensemble. The binding property ensures that when a cheating Alice wishes to postpone committing
to a string until after the commit phase, then she succeeds in forcing an honest Bob to accept
her choice with bounded probability (in expectation).
Strong string commitment, in which both parameters $a, b$ above are required to be 0, is impossible
for the same reason that strong bit-commitment protocols are impossible [101 E]. Weaker versions
are nonetheless possible, and exhibit a trade-off between the concealing and binding properties.
The trade-off between the parameters $a$ and $b$ has been studied by several researchers [SI HI [3].
Buhrman, Christandl, Hayden, Lo, and Wehner [1] study this trade-off both in the scenario of a
single execution of the protocol and also in the asymptotic regime, with an unbounded number of
parallel executions of the protocol. In the asymptotic scenario, they show the following result in
terms of Holevo information (which is denoted by $\chi$).

Theorem A.1 ([1]) Let $\Pi$ be an $(n, a_1, b)$-$\chi$-QSC scheme. Let $\Pi_m$ represent $m$ parallel executions
of $\Pi$ (so $\Pi_1 = \Pi$). Let $a_m$ represent the binding parameter of $\Pi_m$ and let
$a \stackrel{\mathrm{def}}{=} \lim_{m \to \infty} a_m / m$. Then, $a + b \ge n$.

Jain [3] shows a similar trade-off result regarding QSCs, in terms of the divergence information of
an ensemble (denoted by $D$).

Theorem A.2 ([3]) For single execution of the protocol of an $(n, a, b)$-$D$-QSC scheme,

$$a + b + 8\sqrt{b + 1} + 16 \ge n.$$

As mentioned before, for any ensemble $\mathcal{E}$, divergence information is bounded in terms of the Holevo
information: $D(\mathcal{E}) \le \chi(\mathcal{E}) + 1$. This immediately implies:

Theorem A.3 ([3]) For single execution of the protocol of an $(n, a, b)$-$\chi$-QSC scheme,

$$a + b + 8\sqrt{b + 2} + 17 \ge n.$$

As Jain shows, this implies the asymptotic result due to Buhrman et al. (Theorem A.1).

The separation that we demonstrate between divergence and Holevo information (Theorem 1.1)
shows that for some ensembles over $n$ qubits, $D(\mathcal{E})$ may be a $\log n$ factor smaller than $\chi(\mathcal{E})$. For such
ensembles the binding-concealing trade-off of Theorem A.2 is stronger than that of Theorem A.1.

A.2 Privacy trade-off for two-party protocols for relations

Let us consider two-party protocols between Alice and Bob for computing a relation
$f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$.
Jain, Radhakrishnan, and Sen [6] studied to what extent the two parties may solve $f$ while keeping
their respective inputs hidden from the other party. They showed the following:

Result A.4 ([6], informal statement) Let $\mu$ be a product distribution on $\mathcal{X} \times \mathcal{Y}$. Let Qy^'^^ {f)
represent the one-way distributional complexity of $f$ with a single communication from Alice to
Bob, and distributional error under $\mu$ at most 1/3. Let $X$ and $Y$ represent the random variables
corresponding to Alice and Bob's inputs respectively. If there is a quantum communication protocol
for $f$ where Bob leaks divergence information at most $b$ about his input $Y$, then Alice leaks divergence
information at least Q,{Q'^'I^~^^ {f) /2^''^^) about her input $X$. A similar statement also holds with the
roles of Alice and Bob interchanged.

From the upper bound on the divergence information in terms of Holevo information this immedi- 
ately implies the following. 
Result A.5 ([6], informal statement) Let $\mu$ be a product distribution on $\mathcal{X} \times \mathcal{Y}$. Let Q^'l^~^^ {f)
represent the one-way distributional complexity of $f$ with a single communication from Alice to
Bob, and distributional error under $\mu$ at most 1/3. Let $X$ and $Y$ represent the random variables
corresponding to Alice and Bob's inputs respectively. If there is a quantum communication protocol
for $f$ where Bob leaks Holevo information at most $b$ about his input $Y$, then Alice leaks Holevo
information at least Q,{Q'^'I^~^^ {f) /2^^^^) about her input $X$. A similar statement also holds with the
roles of Alice and Bob interchanged.

It follows from Theorem 1.1 that Result A.4 is much stronger than Result A.5 in case
the ensembles arising in the protocol between Alice and Bob have divergence information much
smaller than their Holevo information.